PAC-Bayesian Theory Meets Bayesian Inference

Neural Information Processing Systems

That is, for the negative log-likelihood loss function, we show that the minimization of PAC-Bayesian generalization bounds maximizes the Bayesian marginal likelihood. This provides an alternative explanation to the Bayesian Occam's razor criteria, under the assumption that the data is generated by an i.i.d. distribution.


Reviews: PAC-Bayesian Theory Meets Bayesian Inference

Neural Information Processing Systems

The paper is well written and theoretically strong. It's been conjectured in the past that there should be links between PAC-Bayes theory and Bayesian inference, but to my knowledge this is the first theoretically complete demonstration of such links.

Some comments:
- In eq. (8) (and above), the notion of a prior with bounded likelihood is introduced. Am I right in thinking that this is a data-dependent prior, since whether the likelihood is bounded for a given prior can only be known after observing the data? If this is not the case, can you explain how such a prior is possible?


PAC-Bayesian Theory Meets Bayesian Inference

Neural Information Processing Systems

That is, for the negative log-likelihood loss function, we show that the minimization of PAC-Bayesian generalization bounds maximizes the Bayesian marginal likelihood. This provides an alternative explanation to the Bayesian Occam's razor criteria, under the assumption that the data is generated by an i.i.d. distribution. Moreover, as the negative log-likelihood is an unbounded loss function, we motivate and propose a PAC-Bayesian theorem tailored for the sub-gamma loss family, and we show that our approach is sound on classical Bayesian linear regression tasks. Papers published at the Neural Information Processing Systems Conference.
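
The link stated in the abstract can be checked numerically on a toy problem. The sketch below is not the paper's theorem. It only evaluates the data-dependent quantity that PAC-Bayesian bounds of this kind typically trade off: n times the expected empirical negative log-likelihood under a posterior rho, plus KL(rho || prior). The finite set of coin biases, the uniform prior, the data, and all function names are hypothetical choices made for illustration. Under those assumptions, the quantity is minimized by the Bayesian posterior, and its minimum equals the negative log marginal likelihood.

```python
import math

def bayes_posterior(prior, log_liks):
    # Posterior over a finite parameter set: prior(theta) * p(data | theta) / Z.
    weights = [p * math.exp(ll) for p, ll in zip(prior, log_liks)]
    z = sum(weights)
    return [w / z for w in weights]

def neg_log_marginal_likelihood(prior, log_liks):
    # -ln Z, with Z = sum over theta of prior(theta) * p(data | theta).
    return -math.log(sum(p * math.exp(ll) for p, ll in zip(prior, log_liks)))

def kl_divergence(rho, prior):
    # KL(rho || prior) for discrete distributions on the same support.
    return sum(r * math.log(r / p) for r, p in zip(rho, prior) if r > 0)

def kl_regularized_nll(rho, prior, log_liks):
    # n * E_rho[empirical NLL risk] + KL(rho || prior).
    # For the NLL loss, n * risk(theta) is simply -ln p(data | theta).
    expected_nll = -sum(r * ll for r, ll in zip(rho, log_liks))
    return expected_nll + kl_divergence(rho, prior)

# Hypothetical toy setup: three candidate coin biases with a uniform prior.
thetas = [0.2, 0.5, 0.8]
prior = [1 / 3, 1 / 3, 1 / 3]
data = [1, 1, 0, 1, 1, 0, 1, 1]
log_liks = [
    sum(math.log(t if x == 1 else 1 - t) for x in data) for t in thetas
]

posterior = bayes_posterior(prior, log_liks)
objective_at_posterior = kl_regularized_nll(posterior, prior, log_liks)
objective_at_prior = kl_regularized_nll(prior, prior, log_liks)
neg_log_z = neg_log_marginal_likelihood(prior, log_liks)

print(objective_at_posterior, neg_log_z, objective_at_prior)
```

The first two printed values agree up to floating-point error, and the third is larger, which is the discrete version of the statement that minimizing the bound's objective maximizes the marginal likelihood. The constants and confidence terms of the actual bounds, and the sub-gamma treatment of the unbounded loss, are outside this sketch.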